Novel pfk13 polymorphisms in Plasmodium falciparum population in Ghana

The molecular determinants of Plasmodium falciparum artemisinin resistance are single nucleotide polymorphisms (SNPs) in the parasite’s kelch propeller domain, pfk13. Validated and candidate markers are under surveillance in malaria-endemic countries that use artemisinin-based combination therapy. However, the pfk13 mutations that may confer artemisinin resistance on parasites in Africa remain elusive. It has therefore become imperative to report all pfk13 gene polymorphisms observed in malaria therapeutic efficacy studies so that they can be functionally characterized. We herein report all novel pfk13 mutations observed only in the Ghanaian parasite population. In all, 977 archived samples collected from 2007 to 2017 from children aged 12 years and below with uncomplicated malaria were used. PCR/Sanger sequencing analysis revealed that 78% (763/977) of the samples analyzed were wild type (WT) for the pfk13 gene. Of the 214 (22%) mutants, 78 were novel mutations observed only in Ghana. The novel SNPs include R404G, P413H, N458D/H/I, C473W/S, R529I, M579T/Y, C580R/V, D584L, N585H/I and Q661G/L. Some of the mutations were specific to particular sites and ecological zones. There was low nucleotide diversity and evidence of purifying selection at the pfk13 locus in the Ghanaian parasite population. With increasing drug pressure and the parasite resistance that may follow, documenting these mutations as baseline data is crucial for future molecular surveillance of P. falciparum resistance to artemisinin in Ghana.

www.nature.com/scientificreports/ population because these could be of interest in the future. These SNPs have not been reported from any country as revealed from searches in published articles from PUBMED up to the date of submitting this article.

Results
Twenty-two percent of the samples (214/977) had pfk13 mutations, of which 78 were unique SNPs; 95% of those were non-synonymous. Mutations were observed in 63 codons, with between one and three SNPs per codon (for example N458D, N458I and N458H). Most of the novel SNPs were seen in only one sample (frequency of 0.47%). The coastal zone, consisting of Accra and Cape-Coast (which are also urban areas), had more novel SNPs than the forest zone, which has six sites (Begoro, Bekwai, Koforidua, Hohoe, Tarkwa and Sunyani). Within the forest zone, Koforidua had the most novel SNPs. All the novel SNPs are shown in Table 1. Unique mutations were observed at the different sites and ecological zones. The SNPs unique to an ecological zone are C580R and K669E/N for the coastal zone, M579T/Y and D584L for the forest zone, and N554P and A569P for the savannah zone.
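The proportions quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; reading the 0.47% singleton frequency as one sample out of the 214 mutants is an assumption on our part, since the denominator is not stated explicitly.

```python
# Counts taken from the text.
total_samples = 977
mutant_samples = 214
wild_type_samples = total_samples - mutant_samples


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


mutant_pct = percent(mutant_samples, total_samples)
wild_type_pct = percent(wild_type_samples, total_samples)

# Assumption: a SNP seen in one sample is expressed relative to the
# 214 mutant samples, which reproduces the quoted 0.47%.
singleton_pct = percent(1, mutant_samples)

print(f"wild type: {wild_type_samples} samples ({wild_type_pct:.1f}%)")
print(f"mutant:    {mutant_samples} samples ({mutant_pct:.1f}%)")
print(f"singleton SNP frequency: {singleton_pct:.2f}%")
```

Rounded to whole numbers, these values agree with the 78% and 22% reported for wild-type and mutant samples.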
Distribution of mutations in the pfk13 propeller domain in the Ghanaian isolates. Novel SNPs that were unique to individual sites were observed in different domains of the propeller region. The SNPs exclusive to Hohoe were mostly located from the BTB/POZ domain to blade 3, and those of Koforidua were located within blades 3 to 6. SNPs observed in the samples from Cape Coast were located within blades 4, 5 and 6, and those from Accra were found in blades 1, 3 and 5. Of the 78 novel mutations detected, the highest number was recorded in blade 3 and the lowest in blades 2 and 6, as shown in Table 1.
The P. falciparum k13 gene showed low diversity and evidence of purifying selection in the Ghanaian parasite population. To investigate the diversity at the pfk13 locus, we determined population genetics metrics of DNA polymorphism using 792 sequences in total. Overall genetic diversity at the pfk13 locus was low (π = 0.00383) (Table 2), indicating that the sequence of the gene locus was largely similar among the 792 samples analyzed. This low genetic diversity did not change when analyzed per location, year, or ecological zone (

Discussion
Reporting all observed SNPs in the pfk13 gene is important, especially while the molecular markers of resistance in Africa are yet to be identified. In this study, sequence analysis revealed a number of novel SNPs observed only in the Ghanaian parasite population over a decade. Although the mutations were many and spread across different codons of the gene locus, their frequencies were low, and the computational DNA analysis showed low nucleotide diversity in a population under purifying selection. Our previous paper has already reported mutations seen in Ghanaian isolates that have been observed elsewhere, including variants of some of the validated and candidate markers of ART resistance [14]. Functional characterisation using CRISPR genome editing followed by the Ring Stage Survival Assay (RSA) of two clones carrying one novel mutation, C580R (C580R_1 and C580R_2), showed parasite survival rates of 18% and 14% respectively, while that of the validated marker, C580Y, was 28% in the same experiment (OCK Hagan et al., data yet to be published). These findings support the view that any observed mutation in pfk13 could be a potential marker of drug resistance and must therefore be documented. The unique mutations observed in the parasite population of Ghana were not shared, even among sites of the same ecological zone, which could reflect minimal gene flow between the sites within each zone [12]. This observation corroborates the available data on resistance to ART, which do not show geographical clustering of mutations; common mutations are not shared among parasite populations, resulting in regional diversity [15][16][17]. The novel mutations are a result of genetic recombination and localised evolution of the gene, which is a consequence of high transmission intensity. Differences in transmission patterns [18-20] are a probable explanation for the observed genetic variability.
Notably, most of the SNPs were observed in one sample and only a few were seen in 2 or 3 samples. Because they are non-synonymous, the mutations could also carry a fitness cost for the parasites and may not necessarily be linked to drug resistance. In addition, they could be evidence of the start of an independent emergence of pfk13 mutations in Ghana, as observed in the parasite population of Guyana [12,16].
The mutations in the Ghanaian isolates were distributed across all the domains, from the BTB/POZ domain to blade 6, with variation among sentinel sites located in the same ecological zone. Most mutations were in blade 3, followed by blades 4 and 5, but at low frequencies. The propeller domain is known to be conserved in P. falciparum; however, the mutations observed could reflect parasite adaptation to the selective pressures of antimalarial drug use in Africa (fake drugs, noncompliance and presumptive treatment of malaria) [13,21]. The large pool of low-frequency genetic mutations could allow resistance to emerge faster than anticipated as drug pressure from ACT use increases [22]. Unlike the non-synonymous mutations in parasites of the SEA region, which occur at high frequencies and are moving from intermediate to fixation levels, those of Africa occur at very low frequencies with high allelic variation [23].
Nucleotide diversity (π) at the pfk13 locus can be considered an indirect measure of the potential for the selection of an ART-tolerant variant. A high π at the pfk13 locus suggests sufficient diversity for a soft or hard selective sweep on the locus. In contrast, a low π suggests a reduced probability of a selective sweep on the pfk13 locus. The finding of low diversity at the pfk13 locus therefore suggests that the risk of a tolerant pfk13 variant emerging between 2007 and 2017 was low. It is also evident that pfk13 is largely conserved in the P. falciparum population of Ghana. This lack of diversity at the pfk13 locus may be due to the fitness cost of any new variant. Our findings are expected in the context of relatively high transmission, which correlates with higher sexual outcrossing in the mosquito vector and thus with the breakdown by recombination of any nascent pfk13 variant/haplotype. Other factors that might militate against high diversity at the pfk13 locus include the prevalence of human malaria immunity and within-host multiplicity of infection/competition. These factors may act to negatively select emerging ART-tolerant variants segregating in our population, as portrayed by the results. The finding of a negative Tajima's D may also suggest recent population expansion with multiple low-frequency variants. The presence of several variants at low frequencies contributes to the haplotype diversity observed in the analysis. Additionally, the findings of low nucleotide diversity and purifying selection at the pfk13 locus are congruent with those of a similar study that investigated the evolution and genetic diversity of the pfk13 gene [24].
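To make the two statistics discussed above concrete, the following minimal sketch computes nucleotide diversity (π) and Tajima's D from a set of aligned, equal-length sequences using the standard textbook formulas. It is an illustration only and not necessarily the software or settings used in this study; the function names and the toy alignment are hypothetical.

```python
from collections import Counter
from math import sqrt


def pairwise_differences(seqs):
    """Return (k, S): mean pairwise differences and number of segregating sites."""
    n = len(seqs)
    n_pairs = n * (n - 1) / 2
    total_diff = 0.0
    seg_sites = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            total_diff += (n * n - sum(c * c for c in counts.values())) / 2
    return total_diff / n_pairs, seg_sites


def tajimas_d(n, k, seg_sites):
    """Tajima's D from sample size n, mean pairwise differences k and S."""
    if seg_sites == 0:
        return 0.0  # D is undefined without variation; 0.0 is a convention here
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignment: three identical sequences plus three singletons.
toy_alignment = [
    "ATGCATGCAT",
    "ATGCATGCAT",
    "ATGCATGCAT",
    "ATGAATGCAT",
    "ATGCATGCTT",
    "TTGCATGCAT",
]
k, S = pairwise_differences(toy_alignment)
pi = k / len(toy_alignment[0])  # per-site nucleotide diversity
D = tajimas_d(len(toy_alignment), k, S)
print(f"S = {S}, pi = {pi:.4f}, Tajima's D = {D:.3f}")
```

In the toy alignment every variant is a singleton, so the mean pairwise difference falls below the value expected from the number of segregating sites and D is negative. This is the same qualitative pattern as the excess of low-frequency variants described in the text.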

Conclusion
A change in genetic composition and the resultant change in amino acids can affect protein function. The observation of numerous novel non-synonymous mutations at low frequencies is indicative of nascent resistance developing at the genotypic level that is yet to be revealed as phenotypic traits in Ghanaian parasites. The currently reported efficacy of ACTs is above 95% [25], which is quite high compared to some countries in the region. The novel mutations will be monitored continuously, and functional characterization will be performed on those whose frequencies increase over time to establish their role in parasite resistance to ACTs in Ghana.
Table 2. Summary of computational analysis of DNA polymorphisms found in pfk13 in Ghanaian isolates by location, year and ecological zones. The computational analysis of the sequences to reveal the nucleotide diversity of the mutations in the pfk13 gene in Ghanaian isolates for study sites, year and ecological zones. S: number of segregating sites in the gene; π: nucleotide diversity at the gene locus.

Methods
Study sites and population. Archived samples from therapeutic efficacy studies (TES) conducted at sentinel sites in three ecological zones of Ghana, namely coastal, forest and savannah, were used for the study. Malaria transmission is perennial in the coastal and forest zones and seasonal in the savannah zone. The sentinel sites are Accra, Begoro, Bekwai, Cape-Coast, Hohoe, Koforidua, Navrongo, Sunyani, Tarkwa, Yendi and Wa (Fig. 4). Accra and Cape-Coast lie in the coastal savannah zone; Navrongo, Yendi and Wa lie in the guinea savannah zone; Begoro, Bekwai, Koforidua, Sunyani, Hohoe and Tarkwa lie in the forest zone. The study sites are described in detail in Matrevi et al. [14].
Samples and molecular analysis. Archived filter paper blood blots were used, prepared from children aged 12 years and below who reported at the clinic with uncomplicated malaria during the malaria transmission seasons from 2007 to 2017. The parents/guardians of the children gave informed consent for their participation in the studies. The consent also covered the future use of the archived samples for further molecular analysis. DNA was extracted using a QIAamp DNA Mini Kit (QIAGEN, Germany) following the manufacturer's protocol. The targeted portion of the pfk13 gene was amplified using the nested PCR protocol of Talundzic et al. [26] with minor modifications. Positively amplified samples were Sanger sequenced by Macrogen Europe (Netherlands).
Sequence analysis. The pfk13 sequences obtained were submitted to the standard nucleotide Basic Local Alignment Search Tool (BLAST) database search program on the National Center for Biotechnology Information (NCBI) website to confirm the authenticity of the sequences. The sequences were then aligned against the 3D7 wild-type pfk13 sequence (PF3D7_1343700), obtained from PlasmoDB (www.plasmodb.org), as reference. Sequences were edited using BioEdit ClustalW Multiple Sequence Alignment Software. They were further analysed using CLC Main Workbench 20.04 software (Qiagen, Aarhus, Denmark) and Benchling.com (San Francisco, CA, USA). PubMed was searched for SNPs published by other researchers to establish which of the observed SNPs were new.
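The step from an aligned sample sequence to a named amino-acid substitution such as C580R can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in BioEdit, CLC Main Workbench or Benchling: the function name, the toy reference fragment and its starting codon number are all hypothetical, and the sequences are assumed to be in frame and already aligned without gaps.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def call_substitutions(ref_cds, sample_cds, first_codon=1):
    """List non-synonymous changes (e.g. 'C580R') between two in-frame sequences.

    first_codon is the codon number of the first three bases of ref_cds.
    """
    if len(ref_cds) != len(sample_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length and in frame")
    calls = []
    for offset in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[offset:offset + 3].upper()
        alt_codon = sample_cds[offset:offset + 3].upper()
        if ref_codon == alt_codon:
            continue
        ref_aa = CODON_TABLE.get(ref_codon)
        alt_aa = CODON_TABLE.get(alt_codon)
        if ref_aa is None or alt_aa is None:
            continue  # ambiguous base or gap: skipped in this sketch
        if ref_aa != alt_aa:  # synonymous changes are not reported
            calls.append(f"{ref_aa}{first_codon + offset // 3}{alt_aa}")
    return calls


# Hypothetical three-codon fragment, numbered from codon 579 for illustration.
toy_reference = "ATGTGTGAT"  # M, C, D
toy_sample = "ATGCGTGAC"     # M, R, D (third codon change is synonymous)
print(call_substitutions(toy_reference, toy_sample, first_codon=579))
```

Only the codon whose translation changes is reported, which is why a synonymous change in the same fragment does not appear in the output.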
Computational pipeline for population genetics analysis of pfk13 gene. Base-calling, alignment, and deconvolution of Sanger chromatogram trace files were done using the command-line version of the application Tracy [27]. The output binary variant call format (bcf) files for each sample were converted to human-readable variant call format (vcf) files using custom bash scripts. Low-quality variants (< 40) and indels were filtered